Longteng Guo

Institute of Automation, Chinese Academy of Sciences, School of Artificial Intelligence, University of Chinese Academy of Sciences

Can MLLMs Reason Beyond Language? VisReason: A Comprehensive Benchmark for Vision-Centric Reasoning

May 25, 2026

Semantic-Enriched Latent Visual Reasoning

May 19, 2026

When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution

May 14, 2026

SciVQR: A Multidisciplinary Multimodal Benchmark for Advanced Scientific Reasoning Evaluation

May 11, 2026

M$^3$-VQA: A Benchmark for Multimodal, Multi-Entity, Multi-Hop Visual Question Answering

Apr 28, 2026

AdaSpark: Adaptive Sparsity for Efficient Long-Video Understanding

Apr 09, 2026

Thinking in Streaming Video

Mar 13, 2026

S1-MMAlign: A Large-Scale, Multi-Disciplinary Dataset for Scientific Figure-Text Understanding

Jan 01, 2026

UrbanNav: Learning Language-Guided Urban Navigation from Web-Scale Human Trajectories

Dec 10, 2025

Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward

Jun 05, 2025